Early Deletion of Fillers In Processing Conversational Speech
نویسندگان
چکیده
This paper evaluates the benefit of deleting fillers (e.g. you know, like) early in parsing conversational speech. Readability studies have shown that disfluencies (fillers and speech repairs) may be deleted from transcripts without compromising meaning (Jones et al., 2003), and deleting repairs prior to parsing has been shown to improve its accuracy (Charniak and Johnson, 2001). We explore whether this strategy of early deletion is also beneficial with regard to fillers. Reported experiments measure the effect of early deletion under in-domain and out-of-domain parser training conditions using a state-of-the-art parser (Charniak, 2000). While early deletion is found to yield only modest benefit for in-domain parsing, significant improvement is achieved for out-of-domain adaptation. This suggests a potentially broader role for disfluency modeling in adapting text-based tools for processing conversational speech.
منابع مشابه
Automatic Detection of Sentence Boundaries, Disfluencies, and Conversational Fillers in Spontaneous Speech
Automatic Detection of Sentence Boundaries, Disfluencies, and Conversational Fillers in Spontaneous Speech
متن کاملShould Agents Speak Like, um, Humans? The Use of Conversational Fillers by Virtual Agents
We describe the design and evaluation of an agent that uses the fillers um and uh in its speech. We describe an empirical study of human-human dialogue, analyzing gaze behavior during the production of fillers and use this data to develop a model of agent-based gaze behavior. We find that speakers are significantly more likely to gaze away from their dialogue partner while uttering fillers, esp...
متن کاملAn Improved Model for Recognizing Disfluencies in Conversational Speech
This paper presents a novel metadata extraction (MDE) system for automatically detecting edited words, fillers, and self-interruption points in conversational speech. Our edit word detection sub-system combines a Tree Adjoining Grammar (TAG) noisy channel model, a statistical syntactic language model, and a MaxEnt reranker. Hand-built, deterministic rules are used to detect fillers. Self-interr...
متن کاملA Model of Conversation Processing Based on Micro Conversational Events
I present a theory of discourse interpretation based on the hypothesis that the common ground of a conversation contains a record not only of complete speech acts, but, more in general, of each action of uttering a contribution to the conversation: single words, word fragments, and fillers. I call the action of uttering a ‘minimal’ contribution a MICRO CONVERSATIONAL EVENT. This model can serve...
متن کاملEthnomethodology and Conversational Analysis
In a speech community, people utilize their communicative competence which they have acquired from their society as part of their distinctive sociolinguistic identity. They negotiate and share meanings, because they have commonsense knowledge about the world, and have universal practical reasoning. Their commonsense knowledge is embodied in their language. Thus, not only does social life depend...
متن کامل